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ABSTRACT 

Next generation radio observatories such as the MWA, LWA, LOFAR, CARMA and 
SKA provide a number of challenges for interferometric data analysis. These chal- 
lenges include heterogeneous arrays, direction-dependent instrumental gain, and re- 
fractive and scintillating atmospheric conditions. From the analysis perspective, this 
means that calibration solutions can not be described using a single complex gain per 
antenna. In this paper we use the optimal map-making formalism developed for CMB 
analyses to extend traditional interferometric radio analysis techniques — removing the 
assumption of a single complex gain per antenna and allowing more complete descrip- 
tions of the instrumental and atmospheric conditions. Due to the similarity with holo- 
graphic mapping of radio antenna surfaces, we call this extended analysis approach 
software holography. The resulting analysis algorithms are computationally efficient, 
unbiased, and optimally sensitive. We show how software holography can be used to 
solve some of the challenges of next generation observations, and how more familiar 
analysis techniques can be derived as limiting cases. 



1 INTRODUCTION 

Motivated by the requirements of next generation radio ob- 
servatories, we examine an alternative approach for for- 
mulating optimal radio analyses. The CMB optimal map- 
maicing (OMM) formalism provides an elegant way to trans- 
late from a mathematical description of the measurement to 
provably optimal analyses (Tegmark 1997a,b), and under- 
lies most current CMB observations. In this paper we use 
this optimal map-making formalism to describe the inter- 
ferometric data analysis problem, and show how many of 
the issues faced by next generation arrays can be naturally 
incorporated into this framework. 

After briefly introducing optimal map making and how 
it can be used with interferometric data in §2, wc use this 
framework to expand the mathematical descriptions to in- 
clude direction-dependent antenna response (§3), heteroge- 
neous arrays (§3.1), wideficld refractive atmospheric distor- 
tions (§4.1), and scintillating distortions (§4.2). We then con- 
clude in §5 with a discussion of wideficld efi'ccts and com- 
paring this analysis approach with faceting and other com- 
monly used techniques for analyzing interferometric data 
with direction-dependent calibration and distortion. 

In a recent paper Bliatnagar ct al. (2008) detail a new 
analysis for data with direction-dependent antenna gains 
which is functionally identical to the analysis we develop 
in §3.1. While these papers were developed independently, 
we believe they should be considered as a complimentary 
pair — Bhatnagar et al. demonstrate increased fidelity in the 
context of traditional radio astronomy software, while we 
provide a theoretical foundation for the software hologra- 



phy technique and extend it to a number of other problems 
facing next generation interferometric arrays. Specifically, in 
this paper we will: 

• Place the work of Bhatnagar et al. (2008) on a firm 
theoretical foundation. 

• Extend the ideas of software holography to refractive 
and scintillating atmospheric distortions. 

• Provide a first step towards using CMB deconvolu- 
tion techniques with interferometric data, enabling high- 
precision statistical measurements such as 21 cm Epoch of 
Reionization power spectrum measurements. 



2 OPTIMAL MAP MAKING 

In this section wc briefly introduce the optimal map-making 
formalism (OMM) that underpins most CMB data analyses 
(Tegmark 1997a,b), then as an example show how this de- 
scription can be used to describe the traditional algorithms 
used in radio astronomy analysis software such as AIPS, 
MIRIAD, and CASA. [Throughout wc use linear algebra 
notation as it allows the expressions to be more compact 
and expressive. Table Al and Appendix A allow the equa- 
tions in this paper to be converted to integral equivalents, 
and Appendix B provides a brief refresher on linear algebra 
notation and concepts.] 

There are two key steps in the OMM method: a math- 
ematical description of the measurement, and the optimal 
reconstruction based on this measurement description. In 
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general, one describes the observation in the following form: 

d = Mx + n (1) 

where d is a vector of the measurements and x is the true 
values one is measuring. M is then a matrix operator that 
describes the measurement process, including all instrumen- 
tal, atmospheric, and data handling effects, and n is detector 
noise with covariance matrix N = nn"^. 

If we assume the measurement can be expressed as lin- 
ear equation of the form in Eq. 1 and the noise is Gaussian 
and uncorrelated with the signal (both are true for radio as- 
tronomy), it can then be proved that the minimally biased 
estimator for x is given by 

x= (M^N"^M)"^M^N"M (2) 

(Tegmark 1997a). Equation 2 can be viewed as consisting of 
two separate parts 

x= [m^N'^mJ^ ^M^N'M^ . (3) 



In the first step the measurements are weighted by their 

signal-to-noiso (including covariant noise) and translated by 
the conjugate transpose of the measurement description M 
back into the coordinates of the input sky. Effectively this 
forms a 'dirty map' at the end of step one. The second part 
of Equation 3 then represents a deconvolution step. While 
the OMM method implies a particular style of deconvolution 
that has been very successful for CMB analysis, one could 
also use CLEAN, MEM or other non-linear deconvolution 
algorithms. 

For our case, we are only concerned with the first part 
of Equation 3, and can thus rewrite the relationship as 

X = D"^ M'^N"M, (4) 

where D"-*^ represents the deconvolution algorithm of the 
reader's choice. What is powerful about the OMM frame- 
work is that the dirty map formed by the first part of Equa- 
tion 3 can be proved to be unbiased, lossless, and efficient 
(Tegmark 1997a). In particular the lossless nature guaran- 
tees that all of the information present in the individual 
measurements d is retained in the 'dirty map' formed at the 
end of part 1.^ Often there is significant data compression in 
forming this lossless intermediate representation. For exam- 
ple, in CMB satellites the huge number of time-ordered-data 
values arc reduced to an unprocessed temperature map, and 
for VLA interferometry the raw visibilities are reduced to a 
u,v grid or a 'dirty map.' As long as our description of the 
measurement (M) is accurate none of the information in the 
raw measurements has been lost in this step. 

2.1 Standard Interferometric Techniques 

It is instructive at this point to write down the standard in- 
terferometric analysis of AIPS, MIRIAD, and CASA using 

^ Lossless here means that all of the sky information that was in 
the visibilities is preserved in the intermediate map. This does not 
mean one can obtain the true sky, as the measurement process 
itself removes a lot of information (incomplete u, v coverage, etc.), 
only that the information content of the measurements has been 
preserved in the analysis. 



the optimal map-making formalism. It is traditional to as- 
sume a small field-of-view when deriving the interferometric 
analysis equations (e.g. Clark (1999) equation 1-8), and we 
make the same assumption here as it simplifies the notation 
in the following sections and helps focus the reader on the 
unique characteristics of the software holography approach. 
However, we realize this is a bit of a strawmaii comparison 
as there are more advanced techniques in general usage. In 
§5 we will return to show how widefield effects can be in- 
cluded in all of the developments presented here and discuss 
how this work is related to the more modern approaches of 
Cornwell & Perley (1992), Sault et al. (1996) and Bhatnagar 
et al. (2008). 

We start with a description of the measurement, for a 
standard mid-frequency observation with an array like the 
VLA: 

m(v) = G(v, v)S(v, u)F(u, e)I{e) + n(v). (5) 

(Please see Appendix A for how to translate this into inte- 
gral notation.) In words the measurement equation takes the 
true sky I{6) as a vector, Fourier transforms this to form the 
true u, V distribution (represented by the two-dimensional 
vector u, sec Table Al), samples the true u,v distribution 
with the baseline distribution of the observation S (a set 
of (5-functions at each baseline location) to form the visi- 
bilities, multiplies by the complex gain G appropriate for 
each visibility (can include both instrumental and atmo- 
spheric/ionospheric effects), and adds the per visibility ther- 
mal noise n (usually assumed to be independent for each 
visibility, but formally cross talk and co-variance can be in- 
cluded) . 

Using this description of the measurement, we can di- 
rectly write down the optimal analysis as 

7(61) =D-^ F^(6i,u)S^(u,v) [g^(v,v)N-'] m(v). (6) 

The analysis is essentially weighting by the noise, applying 
the steps describing the measurement in reverse order (and 
conjugate transposed), and deconvolving. Again describing 
the process: we start with a vector of measured visibilities m, 
weight them by the inverse noise co-variance matrix (high 
noise channels receive less weight) , multiply by the transpose 
of the gain G^ to correct the phase, then grid the visibilities 
to the u, V plane with S'^ and Fourier transform to form a 
dirty map of the sky.^''' 

Equation 6 is the traditional analysis as implemented in 
several current interferometric software packages. Often the 
terms in square brackets are combined to form a single Tgyg 
weighting and calibration step using the results of self-cal; 
sometimes the Fourier transform is incorporated into the 
deconvolution step for computational rcEisons; and if a Fast 
Fourier Transform is used anti-aliasing filters must be added. 
However, the fundamental algorithm is the same. This is 

^ Note that the operator argument (u, v) refers to mapping from 
visibilities (v) to the u,v plane (u). Please see Table Al for de- 
tails. 

In some radio software implementations the conjugate recip- 
rocal of the gain is used (1/gJ or as opposed to the gain 
conjugate {g* or G-^) as depicted in Equation 6. This difference is 
usually unimportant in a non-linear CLEAN like algorithm (other 
than the units of the intermediate map) , but the version in Equa- 
tion 6 maximizes the signal-to-noise. 
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reassuring as it has long been known tliat tliis algoritlim is 
optimal if the description of the measurement in Equation 5 
holds. 

The problem encountered by next generation arrays 
is that Equation 5 docs not accurately describe their 
measurement — there are a number of assumptions about 
the measurement embedded in this measurement descrip- 
tion. Describing the gain as a per-visibiUty complex num- 
ber G(v,v) assumes that the gain and phase are uniform 
across the antenna field of view. Several next generation in- 
struments have instrumental gain and phase which varies 
as a function of direction within the field of view, funda- 
mentally breaking this assumption. Similarly atmospheric 
distortions with length scales smaller than the field of view 
cannot be expressed as a single per-bascline complex rmm- 
ber. In addition, the flat sky assumption is incorporated in 
the 6 coordinate system, hampering the analysis of widefield 
observations. 

The remainder of this paper largely consists of rewriting 
Equation 5 to accurately describe the measurements pro- 
posed with next generation arrays, and using the optimal 
map-making formalism to determine the appropriate analy- 
sis methods. Throughout this paper wc assume the antenna 
calibration has been determined separately. In current soft- 
ware the overall antenna and atmospheric delay is usually 
measured via self-cal as part of the deconvolution process, 
but polarimetric calibration is usually determined through 
separate parallactic or holographic observations. Either ap- 
proach to determining the calibration can be used in this 
formalism. 



3 POSITION DEPENDENT CALIBRATION & 
HETEROGENEOUS ARRAYS 

This section concentrates on instrumental calibration when 
the gain and phase vary as a function of direction within the 
field of view. This case is commonly encountered in widefield 
imaging, such as low frequency observations with the VLA 
and all upcoming low frequency arrays such as the MWA, 
LWA, and LOFAR. The additional complication of atmo- 
spheric calibration will be delayed until Section 4. 

We will first assume that the gain pattern is identical 
for all antennas. The standard measurement description in 
Equation 5 can be modified to form 

m(v) = S(v, u)F(u, 6>)B(e, e)I{e) + n(v). (7) 

We have replaced the per baseline gain G(v, v) with a di- 
rection dependent complex power pattern B(^, 6). The beam 
pattern attenuates the signal seen by the interferometer, but 

the remainder of the measurement is unaffected. 

Again following the OMM framework our analysis 
method should be 

7(61) = D"^ B^(6I, e)F'^{e, u)S'^(u, v)N-'m(v). (8) 

This is largely identical to Equation 6, except that we have 
multiplied by the beam transpose B^. In forming this trans- 
pose, we take the complex conjugate of the gain towards 
each sky pixel 6 and reorder the entries. This means that a 
pixel gain of le"*""** becomes |e~"*: the phase is corrected 
but the gain amplitude is applied a second time. The ampli- 
tude of the 'dirty map' formed just before the deconvolution 



is attenuated by the beam shape squared. The signal is at- 
tenuated once in the measurement description (B in Eq. 7), 
and again by the analysis procedure (B^ in Eq. 8). 

While puzzling at first, this beam squared weighting is 
necessary to form an optimal map. The highest signal-to- 
noise is achieved if signals are variance weighted, and the 
measured signal-to-noise is given by the beam pattern. The 
NVSS team uses beam-squared weighting to add overlapping 
maps for exactly this reason (Condon et al. 1998), and it 
is gratifying to sec this result fall out of this derivation. 
Of course the deconvolution algorithm must understand the 
weighting of the dirty map to appropriately reconstruct a 
final image. 



3.1 Heterogeneous Arrays 

The development above assumed that the directional re- 
sponse of all the antennas were identical. For many upcom- 
ing observations the antennas are not identical, either due 
to antenna-to-antenna variation such as the MWA, or due 
to a mix of different antenna types as in CARMA. In this 
section we will remove the assumption of identical antenna 
responses, and follow a slightly more detailed derivation to 
illustrate use of the OMM method. 

The easiest way to extend Equation 7 to heterogeneous 
arrays is to subscript the beam pattern, giving each baseline 
a unique power pattern 

m(v) = S(v, U6)F(U6, eb)Bb{eb, e)I{e) + n(v). (9) 

This equation creates a different observed sky for each an- 
tenna pair 9b, and the remainder of the measurement re- 
mains the same. Unfortunately this is a computationally 
expensive description of the measurement as each of the h 
separate observed skies require a Fourier transform and sam- 
pling, and it leads to a computationally expensive analysis 
algorithm: 

J(6») = B^(e, ObW'^ieb, U5)S^(u6, v)N-'m(v). (10) 

Here each baseline is gridded (single w, v (5-function) and 
Fourier transformed to produce a single fringe, which is then 
attenuated by the power pattern appropriate for that base- 
line before being added to a common dirty map. Concep- 
tually, this is correct. The fringe should only be added to 
portions of the image seen by that antenna pair, and be- 
cause B is complex the location of fringe peaks can shift 
from one portion of the image to another in response to 
direction dependent phase response. 

Fortunately, there is a more efficient way to perform the 
same analysis. Returning to the measurement description in 
Equation 9, we can recast the problem 

m(v) = S(v,Ub)F(u;„6»6)B6(6i6,6l)7(6»)-hn(v), 

m(v) = S(v,U6)B6(u6,u)F(u,6»)/(6')-Mi(v), 

m(v) = B(v,u)F(u,e)/(e)-fn(v). (11) 

In going from line 1 to 2, we have simply pulled the power 
pattern B to the other side of Fourier transform and ex- 
pressed the power response in u, v coordinates (the operators 
commute because the Fourier transform is unitary, and B;, 
becomes a convolution). The beam pattern is still baseline 
dependent in line 2, but the sampling function S is already 
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selecting out individual baselines to create visibility mea- 
surements, so these operators can be combined in the last 
line. 

In words, the input sky is transformed to u,v coordi- 
nates, then the appropriate region of the u, v plane is inte- 
grated to form the visibility measured by that pair of anten- 
nas using the unique power response of that antenna pair. It 
is interesting to note that Equation 11 is the algorithm used 
for simulating the response of interferometric observations 
by the MIT Array Performance Simulator (MAPS). 

Before moving to the analysis, a natural question is the 
size of the power pattern B in u,v coordinates — if it cov- 
ers a significant portion of the ii. i' plane it would remain 
computationally expensive. Each antenna has a response to 
the incident electric field Wa{0,9), where the response is 
complex to capture both the electric field gain and phase 
delay. We can transform the antenna response to u, v co- 
ordinates Wa (u, u) . This transformed antenna response is 
exactly what is obtained during a holographic measurement 
of the antenna gain (Scott & Ryle 1977). Because the elec- 
tric field outside the antenna physically cannot be added into 
the received signal, Wa(u,u) has the same size as the an- 
tenna.* The power pattern observed by a baseline is given by 
the multiplication of the constituent antenna sky responses, 
or the convolution of the w, v plane responses of antennas i 
and j 

B,,(u,u) = Wf(u,u)*W,(u,u). (12) 

Thus the power pattern B(u, w) is very compact in the u, v 
plane, approximately twice the physical width of an antenna. 
This compact feature is why it forms the basis of array sim- 
ulators. 

Moving to the analysis, we can again use OMM to form 
an optimal analysis approach: 

I{e) = D"^ F^(6», u)B^(u, v)N"^m(v). (13) 

In this analysis we have replaced the simple (5-function grid- 
ding of S^(u, v) with a gridding function B^(u,v) that 
spreads the visibility out on the u, v plane using the power 
pattern response of that particular antenna pair. Effectively 
the direction and baseline dependent instrumental calibra- 
tion has become part of the gridding kernel.^ Because this 
uses the holographic antenna response to perform the cali- 
bration, we call this analysis technique software holography. 

From a statistical viewpoint software holography can be 
understood as adding a Bayesian prior to the analysis. In the 
standard analysis (Equation 6) it is assumed that the anten- 
nas have uniform power responses across the image, and the 
calibration simply affects the amplitude and phase of the 
fringe measured by a single baseline. In the software holog- 
raphy analysis, we arc adding the prior that we know the 
direction-dependent gain of each antenna. With this prior, 
only the portions of the sky the given antenna pair were 
sensitive to should be reconstructed with a fringe — if the 

* Reflections from outside the antenna and widefield w-projection 
eff'ects can make tiic response slightly larger, but it remains very 
compact in the u, v plane. 

^ When using an FFT, an anti-aliasing filter must be added to the 
power response kernel using an additional convolution. This has 
no impact on the final precision if the effects of the anti-aliasing 
filter are included in the deconvolution. 



antennas could not respond to radiation from a portion of 
the sky, the measured visibility should not be interpreted as 
coming from that direction. This enveloping of the fringe by 
the baseline power response is shown graphically in Figure 
1. 

Not including the antenna-dependent holographic mea- 
surements (when known) is a form of bias. Incorporating 
these effects using software holography improves the preci- 
sion of high-dynamic range measurements for current ob- 
servations with the VLA and similar interferometers (Bhat- 
nagar et al. 2008), in addition to observations with next 
generation instruments. 



4 ATMOSPHERIC & IONOSPHERIC 
DISTORTIONS 

Lonsdale (2005) provides a review of atmospheric and iono- 
spheric distortions. In that work the effects of atmospheric 
disturbance are separated into four regimes, depending on 
the characteristics of the interferometer and the length scales 
of atmospheric disturbances. Briefly these regimes are: 

1) A narrow field of view and short baselines. Distortion 
appears as a translation of entire field, correctable with a 
tip-tilt compensation. 

2) A narrow field of view and long baselines. Independent 

phase delay for each antenna, but the delay applies to en- 
tire antenna field of view. Appears as scintillation, but same 
scintillation pattern for all sources in the field. Correctable 
with single calibrator adaptive optics and self-cal, and is 
typical for VLA observations. 

3) Short baselines but a wide field of view. Distortion is 
purely refractive, but varies across the field of view. Some- 
times described as a rubber-sheet distortion and is typical 
of MWA observations 

4) Wide field of view and long baselines. The worst case, 
as sources scintillate across the field of view with a position 
dependent scintillation screen. This is the challenge faced by 
the LWA and LOFAR at the longest baselines. 

Regimes 1 & 2 are described by the standard measure- 
ment description (Equation 5), and are well handled with 
current analysis algorithms. Regimes 3 & 4 cannot be de- 
scribed as a single phase delay per antenna, because the de- 
lay varies across the field of view. In the following sections 
we will explore how to use OMM to analyze observations in 
the challenging atmospheric conditions or regimes 3 & 4. 

4.1 Widefield Refractive Distortions 

In regime 3, atmospheric and ionospheric distortions ap- 
pear as a rubber-sheet distortion: the apparent positions of 
sources are shifted but they do not appear to scintillate. 
Mathematically this can be described by: 

m(v) = B(v, Ut)F(ut, e't)A{e'u 9; 1)1(6) + n(v'). (14) 

Here we have added a time-dependent atmospheric distor- 
tion A{6[,6;t) that moves the apparent locations of the 
sources seen by the array from 9 to 6' . The atmospheric dis- 
tortion presented here can also include position dependent 
absorption and Faraday rotation, as long as the distortion 
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Figure 1. This figure shows the contours from a dirty map for a single baseUne using software holography. For this example we have 
used an antenna consisting of a 4x4 array of dipoles with a spacing of nearly one wavelength so there are strong grating lobes, and a very 

widcficld image to show both the primary beam (center) and grating sidclobes (surrounding). This is similar to the beam patterns seen by 
LOFAR and MWA antennas near the top of their frequency bands. In the image the fringe from the single visibility is clearly seen as the 
diagonal corrugations, but its amplitude has been enveloped by the known antenna pattern and the sidclobes are clearly evident. While 
the beam pattern covers the sky, the corresponding convolution in the u, v plane is very compact. In traditional intcrferomctric analysis, 
the corrugations would have the same amplitude across the image, as the prior of the antemia pattern is not used. In software holography 
the enveloping power pattern can vary from baseline-to-baseline to accurately represent the directional sensitivity of individual antenna 
pairs. Not shown here is the direction-dependent shifting of the fringe peaks which can be produced by directional differences in the 
phase delays of the antennas. 



is the same for all antennas in the array. Because the atmo- 
spheric distortion is time-dependent, the coordinate map- 
ping from the real sky to the apparent sky changes. This 
means that the associated u, v coordinates and visibilities 
are also time dependent as indicated. We have also included 
baseline dependent antenna calibration (B) in this example 
to illustrate how effects can be stacked. 
The corresponding analysis is 

m=-D-^ A^(^,ei;t)F^(0^,ut)B^(ut,v)N-im(v). 

(15) 

Through the Fourier transform, this is identical to the anal- 
ysis in the previous section. However, because of the time- 
dependent nature of the atmospheric distortion we can only 
grid and Fourier transform visibilities from one atmospheric 
realization. Effectively we are creating instrumentally cali- 
brated snapshot images of the apparent sky, which we then 
correct with a rubber-sheet correction so sources appear in 
their true locations (9), and then stack the snapshot im- 



ages. The time scale of the snapshot images is set by the 
atmospheric distortions. This analysis is effectively the ap- 
proach taken by the MWA (Mitchell et al. 2007), and the 
ionospheric distortion timescale sets the 8 second snapshot 
cadence of the instrument. 

It is possible to Fourier transform the atmospheric op- 
erator and describe it in the u, v plane, in analogy to what 
we did with the antenna power response in Section 3.1. We 
will look at this approach in the context of the next section, 
but the snapshot imaging appears to be a better solution 
for most interferometers operating in the rubber-sheet con- 
ditions considered here. 



4.2 Widefield Scintillating Distortions 

Scintillating widefield distortions of regime 4 are the most 
difficult, because each antenna sees a different atmospheric 
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phase screen. This is the analysis challenge faced by LOFAR 
and the LWA. 

To describe the widefield scintillation seen in regime 
4 we need to make our atmosphere model more general. 
The wavcfront observed by an individual antenna can be 
described as a direction-dependent phase delay La(&,0;t) 
(labeled L in sympathy with the challenges faced by LOFAR 
and the LWA). The atmospheric distortion seen by a baseline 
is then given by 

Aijie,e) = Llie,e)-Ljie,e), (i6) 

Aij(u,u) = Lf(u,u)*Lj(u,u). (17) 

Effectively A[, is describing the distortion to the fringe pat- 
tern for that bEiseline. Under the regime 4 atmospheric con- 
ditions the fringe pattern for a single visibility will not ap- 
pear as a simple grating across the field, but more like the 
lines on a topographic map as the position dependent iono- 
spheric delay shifts and distorts the locations of the fringe 
peaks. 

We can use Equations 16 and 17 to describe the mea- 
surement as 

m(v) = B(v,ut,6)F(u(,i„6>;,i,)Ai,(6>;,(„6>;t)/(6') -Mi(v'). 

(18) 

Not only is the distortion time variable, it is different for 
every baseline. This leads to an analysis of the form 

i{e) = A^(6i, dt.b-, i)F^(et,6, ut.i,)B^(ut.6, v)N-'m(v). 

(19) 

This analysis is very computationally expensive. In words, 
we must Fourier transform each calibrated visibility to cre- 
ate a snapshot image of that fringe, which is then corrected 
by the baseline dependent atmospheric operator A^. The 
individual fringes are then co-added to form a snapshot im- 
age, and successive images are stacked to form an integrated 
dirty map. 

As an alternative, wo could Fourier transform the at- 
mospheric distortion and pull it to the left of the Fourier 
transform in analogy to Equation 11 to obtain 

m(v) = [B(v, ut.b) A(ut,6, u; t)] F(u, 6)1(6) + n(v'). (20) 

As implied by the square brackets, we can now envision com- 
bining the position dependent atmospheric and instrumental 
distortions into a single u, v plane operation. This would give 
us the the analysis procedure 

7(6>) = D"^ F^(6l,u) [a'^(u,U6)B^(u6,v)] N-V(v). 

(21) 

At first glance this appears to be the obvious solution: cor- 
rect both the scintillating ionosphere and instrumental gain 
in one baseline dependent gridding step. The difficulty is 
that unlike the beam response, the atmospheric distortion 
is not well localized in the u,v plane (Matejek et al. 2007). 

The difference in the locality of the instrumental and 
atmospheric terms is related to the physics of the two dis- 
tortions. The electric field response W of an antenna is fun- 
damentally the sum of the electric field collected at each lo- 
cation on the antenna (W = W1+W2 + ■■■■)■ These terms are 
complex, and if they are added out of phase they interfere 
and destroy the response of the antenna. The atmospheric 
delay L on the other hand is multiplicative: if we decompose 
the atmosphere into many terms each rotates the phase by 
a certain delay angle d. Mathematically this ends up with 



a form of g2T»{di-Hd2-H...)^ Y)ne to the multiplicative nature, 
this is not compact in the u, v plane and strong beating ef- 
fects between atmospheric modes come into play — the spa- 
tial equivalent of intermodulation distortion. We refer the 
interested reader to Matejek ct al. (2007). 

In conclusion, if the ionospheric phase screen is just 
a little more complicated than a single per-antenna delay 
(regime 2) but cam be described with only a couple of large 
sinusoidal modes for each antenna (barely into regime 4), 
the u, V plane atmospheric correction described in Equation 
21 may be useful. However, for complex direction and an- 
tenna dependent atmospheres the u, v size of the correc- 
tion becomes enormous. Under these most challenging con- 
ditions one would be best served with the conceptually sim- 
pler snapshot fringe approach of Equation 19. 



5 DISCUSSION 

To fully integrate software holography into modern intorfcr- 
ometric data analysis there are a few loose ends we should 
tie up, including projection effects and a comparison with 
multi-faceting techniques. 

So far we have used a flat sky (6) and simple Fourier 
relationship to simplify the notation and help focus on the 
unique characteristics of software holography. In general this 
is a poor assumption, particularly in the context of widefield 
atmospheric distortions. Fortunately widefield/w-projection 
effects can be easily added. 

Returning to basics and following the discussion in lec- 
ture 1 of Synthesis Imaging in Radio Astronomy (Clark 
1999), the general spatial correlation relationship can be 
written as (their Equation 1-5) 

V(m, V, w) = C{{u, V, w}, s)/(s), (22) 

where 

C({m, V, w}, s) = J e-^"*""-"'/=d^s (23) 

and w — {u,v,w} — ri — r2. Following Cornwell et al. 
(2003, 2008), in {u, v, w} coordinates the correlation relation 
C can be decomposed into a Fourier transform, a coordinate 
conversion T from s — »■ {l,m}, and an additional term H 

(their Equation 10) 

C = ¥{{u,v},{l,m})ll{{l,m,w},{l,m})T{{l,m},s), 

(24) 

where H = ^-2^ilw(y^l-l2-ra^-l)] _ rpjjjg gjygg ^jjg 

standard limiting cases: 

• If the field of view is small, H is negligible and can be 
ignored. 

• If the array is coplanar, H can be kept negligi- 
ble at the cost of a time-dependent coordinate conver- 
sion T({/', m'}, s). For observatories that contend with a 
direction-dependent atmospheric refraction (§4.1), this co- 
ordinate transformation can be combined with the time- 
dependent atmospheric distortion A{6',9;t) and corrected 
at no additional cost. 

• In the most general case, we can follow the w-projection 
technique developed by Cornwell et al. (2003, 2008). This is 
equivalent to pulling H through the Fourier transform to 
create a widefield u, v correction H(u, u; w, t). 
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Any of these limits can be added to the analyses developed 
in this paper by inserting the appropriate operators. 

As an aside, Cornwell et al. worked around the pole in 
H(u, u) by using the anti-aliasing filter in the gridding step, 
even though this filter is only needed for Fast Fourier Trans- 
forms. In the software-holography context this can be more 
physically interpreted as a convolution with the holographic 
beam pattern B(u, u). Since any real antenna has a non- 
zero size, the pole in H naturally goes away for any physical 
system. 

All of the analysis issues approached in this paper have 
been successfully tackled in the past with mosaic imaging 
and multi-faceting deconvolution techniques of Cornwell & 
Pcrlcy (1992) and Sault et al. (1996). These techniques sub- 
divide the ficld-of-view and apply a separate complex gain 
calibration to each facet, allowing the atmospheric and in- 
strumental calibration to change across the field in a step- 
wise fashion. Faceting has the definite advantage of being 
a proven technique behind many astronomy results, but for 
next generation arrays its requirements on data storage and 
computational efficiency may make software holography an 
attractive alternative. 

The computational requirements of faceting are driven 
by how faceting is integrated into the deconvolution process. 
When a faceted dirty image is produced, the facet edges 
distort the apparent array beam of sources in neighboring 
facets. This is typically dealt with by using the dirty map 
only as an intermediate step in the deconvolution process — 
visibilities are used to make a faceted map, sources are iden- 
tified and subtracted from the raw visibility data, and a new 
faceted dirty map is created for the subsequent iteration of 
the deconvolution algorithm. The deconvolution algorithm 
must always work on the full visibility dataset, as the dirty 
image contains artifacts which cannot be easily removed. 

Bhatnagar et al. (2008) have recently demonstrated 
the use of software holography within a traditional non- 
linear deconvolution algorithm. Their algorithm is signifi- 
cantly faster than faceting and does not suffer the discon- 
tinuities and artifacts in the intermediate image. However, 
their algorithm still subtracts the sky model from the raw 
visibilities, using the dirty map only as an intermediate step 
in the deconvolution process. 

For next generation radio arrays with hundreds of ele- 
ments and wide fields of view the raw visibility data can be 
very large: for example the correlated data rate for the MWA 
is ~19 GBytes/s over just 31 MHz of bandwidth. The op- 
timal map making technique was developed in response to 
these same computational problems as faced by the CMB 
community (Tegmark 1997b). The time series data from a 
satellite such as WMAP and Plank is analogous to interfer- 
ometric visibilities and similarly voluminous, and deconvo- 
lution of the time series data quickly becomes computation- 
ally impractical. OMM allows all of the information in the 
raw data measurements to be preserved in the intermediate 
map, reducing both storage needs and the computational 
requirements of deconvolution. The precision of CMB mea- 
surements is a testament to the power of the optimal map 
making formalism. 

In intcrfcromctric software holography, forming the 
intermediate map is computationally very efficient. The 
direction-dependent gain of each antenna can be corrected 
by gridding with the baseline dependent u, v power pattern. 



This gridding kernel is very compact, leading to a efficient 
imaging algorithm. The atmospheric corrections arc less ef- 
ficient than the instrumental calibration, but the final map 
has no significant artifacts. Thus the deconvolution algo- 
rithms can work directly on the lossless 'dirty map' formed 
by software holography, without referring to the raw visibil- 
ities. Deconvolving intermediate maps should be no slower 
than deconvolving the raw visibilities (while requiring much 
less storage space), and potentially could be much faster 
if techniques from the CMB community can be effectively 
used. 

It is hoped the software holography techniques pre- 
sented in this paper will assist the development of analy- 
sis systems for next generation instrumentation, and enable 
precision interferometric measurements such as power spec- 
trum detection of 21 cm emission from the Epoch of Reion- 
ization. 
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APPENDIX A: TRANSLATING TO INTEGRAL 
NOTATION 

Linear algebra notation is very efficient for tlie describing 
measurements and the related optimal analysis procedures. 
However, much of the interferometry literature is written in 
integral notation. To aid translation, the key operators in 
this paper arc listed in Table Al with their integral equiva- 
lents and a brief description of each mathematical operation. 

As a simple example of how to convert a linear algebra 
equation into its integral equivalent, the basic measurement 
description in Equation 5 



m(v) = G(v, v)S(v, u)F(u, 61)7(61) + n(v) 



(Al) 



can be rewritten using Table Al to create 
mv=gv J S{u - Ub) J e-^'''"-^7(6l) d'e 



d u + Uv. (A2) 



In creating Equation A2 each integral from Table Al en- 
closes all of the expressions which appear to the right in the 
linear algebra version. Using this procedure all of the equa- 
tions in this paper can be translated into integral equiva- 
lents. 



APPENDIX B: BRIEF INTRODUCTION TO 
LINEAR ALGEBRA 

The following is designed as a brief refresher of key linear 
algebra concepts. 

In linear algebra, an operator O transforms an input 
vector into a new vector. For many cases, this becomes a 
generalized change of variables where the result is in a dif- 
ferent coordinate set than the input. 



d(a;) = 0{x,y;z)v{y) 



(Bl) 



In this notation the operator 0{x,y;z) depends on param- 
eters z and transforms the input vector v in coordinates y 
to the result d in a new set of coordinates x. This works 
perfectly well for continuous coordinates as well as discreet, 
we just have to imagine breaking the continuous coordinates 
into very small "pixels." 

For a concrete example, let's imagine calculating the 
sky temperature T for a set of antennas a, given a position 
dependent gain G{0) and a distribution of sources 1(9). In 
the traditional integral formulation the sky temperature of 
each antenna is given by 



: J Ga{e)i{e)d^e. 



This can be as easily written in operator notation as 
T(a) = G(a,e;p)/(e), 



(B2) 



(B3) 



where p is the parameters the antenna gain depends on. Here 
the operator G is performing the integration and explicitly 
going from the continuous sky coordinates 6 to the discreet 
per-antenna coordinates a. One advantage of the operator 
notation is adding polarization information and Jones ma- 
trices is notationally straightforward, we can just add more 
pixels to 6, one for each polarization. Note that because the 
operator G includes the integration, G and G do not have 
the same units (G is unitless and G has units of str or m~^). 



There are a couple of common misconceptions which 
can make reading linear algebra equations more difficult. 

The first is the difference between the conjugate trans- 
pose and inverse of a matrix operator. The conjugate trans- 
pose of an operator simply reverses the rows and columns 
(and takes conjugate if complex), and serves to reverse the 
direction of the coordinate transformation. So for our ex- 
ample above we can easily create G^(^,a;p) by reversing 
the order of the indices and conjugating, and perform the 
following calculation 



r(e) = G^(0,a;p)T(a), 



(B4) 



to move from antenna temperatures back to sky coordinates. 
However, I' (6) is not equal to the true sky brightness dis- 
tribution 1(9) in Equation B3 . While we are back in the 
correct coordinates, we have not recreated the distribution 
of sources in the sky. An antenna temperature does not al- 
low reconstruction of the sky distribution, because of the 
spatial integral in Equations B2 & B3 has erased this infor- 
mation. Additionally, while I' is in the same coordinates 9 as 
the original input vector 7, it is not necessarily in the same 
units (units refers to the value scale at a location whereas 
coordinates refers to the scale of the map axes) . In our exam- 
ple here the operators G and G^ both have the same units, 
not inverse units, so I' {9) does not have the same units as 
I{9) even though the coordinate scales of the images are the 
same. 

To really go back to the input signal we would need to 
know the inverse operator G~^(9,a:,p), and as in most cases 
the inverse operator does not exist for our example due to 
the sky integral. We can go back to the original coordinates 
using the transpose, but we will not obtain the same result 
we started with unless the operator is unitary. 

The second common misconception is to think of the sky 
I{9) as a matrix-like quantity. While it is tempting to think 
of the sky as a two dimensional grid, in linear algebra each 
of the pixels is a separate entry in a long one-dimensional 
vector. Thinking of the input as a matrix breaks the linear 
algebra notation and leads to artificial constraints (effec- 
tively requiring the operators to be diagonal). A corollary of 
this is that the matrix operators are formally very large; if 
our input vector has N^ix, the operator is Npix x Npix in size. 
But this is a formal definition and does not mean that we 
have Npix operations to perform in most cases. The amount 
of computation comes down to how many of the entries in 
the operator are populated — the sparscness of the opera- 
tor. The number of populated entries in the operator rows 
directly determines the number of computations required, 
and in most of our examples this number is quite small. 

An illustrative example is the refractive ionospheric dis- 
tortion operator A(S', 9), which shifts the flux from the true 
sky location 9 to an apparent location 9' . If there is no scin- 
tillation, this is a one-to-one mapping described by a delta- 
function 5{9 — 9' — d9{9, t)). where d9 is the refractive shift. 
The entries in the A{9',9) operator consists of ones to map 
the flux from the input pixel to the output pixel, with a sin- 
gle entry per row. The operator is very sparse and easy to 
apply computationally, however because it is not diagonal 
it cannot be formulated if the input sky is thought of as a 
matrix-like quantity. 
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Table Al. This table lists the operators and vectors used in this paper, along with the integral formulation and comments on the 
operation. 

Linear Al,i;;('l)ra liilt\i^ial Nolaiiiiii ( 'oiiuaenl s 

u u or {u,v} u,v coordinates. In this paper we condense this to a single two dimensional 

vector u to make the notation more comijact and avoid confusion with visibil- 
ities. 

V V 'visibility' coordinates, or a vector listing the visibilities. In integral notation 

usually indicated as a subscript. 



n(v) 








Thermal noise per visibility. The matrix N is formed by the outer ijroduct of 
two vectors of the thermal noise, and allows correlated noise to be included in 
the linear algebra notation (e.g. cable cross-talk). 


m(v) 


my or 






A vector OT Tneasinrpd visibilities TTsiiflllv pvnrpsspd with siibscriTitsi in iTitpcral 
Ti(-»tntinn 


1(0) 


m 






The true sky brightness distribution. Note that in the linear algebra notation 

Llllk!) 1.5 Oi Vc^^UOl Oi i^J^y LlillIJS.lilti KjL Uillo CtiO CL itWKj Lllllieii&ltJliCLl llid 

trix' of values breaks the linear algebra notation (requires all operators to be 

diagonal). 


F(u,6l) 






Fourier transform. May be replaced with an FFT with the addition of an anti- 










aliasing filter. 


S(v,u) 


/6(u- 


- Uj,)d^u 




Sampling function which selects the locations in the u, f-plane which are mea- 
sured by an interferometric baseline b to create a visibility. In the linear algebra 
notation the result is a vector of the visibilities (v) . 


G(v,v) 


9v 






A single complex gain per visibility. The matrix version has entries only along 


B{e,e) 


B{0) 






The power response of a pair of antennas. As both the gain and phase may 
change with direction for each antenna, this is a complex function. The B 

operator is diagonal. 


B(u, u) 


B{u)* 


or/B(u'- 




Power response of a pair of antennas in u, v (u) coordinates. The Fourier trans- 
form of B{0) and includes the convolution created by the translating the mul- 
tiplication in 6 coordinates u. Due to the physics of antennas the B operator 
is sparse (of limited extent in u). 


B(v,u) 


f f 5(u' - u'')Bb{u' - ujd'^u d^u' 


Power response of the antennas in a particular baseline. Effectively this is a 










convolution over the sky in u, v coordinates combined with a delta-function 










to select the baseline sampled by that antenna pair. The result is a vector of 










visibilities. 


Wi(u,u) 




* or JWi(u' 


- u)d?u 


The electric field response of an antenna in u, v coordinates. This is the 
holographic antenna pattern, and is the Fourier transform of the direction- 






w{-\/l — l^—m^- 




dependent gain Wi{d). 


H({i,m, w},{l,m,}) 


g-27ri[' 


-1)] 


w-projection. For non-coplanar baselines in the narrow field limit can be inter- 








preted as Presnel diffraction (Cornwell et al. 2008) 










Deconvolution. There are many styles of deconvolution, many of which are 

non-linear (cannot be expressed as an integral or linear algebra equation). Any 
kind of deconvolution can be used with the results presented in this paper. 



